Dialect variation in Boro Language and Grapheme-to-Phoneme conversion rules to handle lexical lookup fails in Boro TTS System
Abstract
A general-purpose text-to-speech (TTS) system cannot include every word of a natural language in its lexicon. A grapheme-to-phoneme conversion system is therefore essential for pronouncing out-of-vocabulary words, and grapheme-to-phoneme rules play a vital role wherever lexical lookup fails. Although a basic grapheme-to-phoneme rule system is very simple, it contributes strongly to the naturalness of a TTS system. Letter-to-sound rules may be written by hand or learned automatically, depending on the language. We worked on the Boro (Bodo) language. A systematic study showed a regular relationship between the written form of a Boro word and its pronunciation, so it is fairly easy to build letter-to-sound rules for Boro by hand. We built letter-to-sound rules from a Boro corpus of 5000 words. We tested these rules using Festival, a widely used speech synthesizer; applying them produced correct pronunciations for approximately 89% of the words. Dialect variation also influences grapheme-to-phoneme conversion rules. This paper gives an overview of Boro dialect variation and of the grapheme-to-phoneme conversion rules developed for the Boro TTS system.
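The lookup-then-rules fallback described in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' Festival rule set: the lexicon entry, the rule table, the phoneme symbols, and the function names `apply_rules` and `pronounce` are all hypothetical placeholders and do not represent real Boro data.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (word -> phonemes); NOT real Boro data.
LEXICON: Dict[str, List[str]] = {
    "kata": ["k", "a", "t", "a"],
}

# Hypothetical letter-to-sound rules (grapheme -> phonemes), illustrative only.
RULES: List[Tuple[str, List[str]]] = [
    ("kh", ["kh"]),
    ("ng", ["N"]),
    ("a", ["a"]),
    ("g", ["g"]),
    ("h", ["h"]),
    ("k", ["k"]),
    ("n", ["n"]),
    ("t", ["t"]),
]

# Try longer graphemes first so that "kh" wins over "k" followed by "h".
_SORTED_RULES = sorted(RULES, key=lambda rule: len(rule[0]), reverse=True)


def apply_rules(word: str) -> List[str]:
    """Convert a word to phonemes with a left-to-right longest-match scan."""
    phones: List[str] = []
    i = 0
    while i < len(word):
        for grapheme, phonemes in _SORTED_RULES:
            if word.startswith(grapheme, i):
                phones.extend(phonemes)
                i += len(grapheme)
                break
        else:
            # No rule covers this character; skip it in this sketch.
            i += 1
    return phones


def pronounce(word: str) -> List[str]:
    """Use the lexicon when the word is listed, otherwise fall back to rules."""
    key = word.lower()
    if key in LEXICON:
        return list(LEXICON[key])
    return apply_rules(key)


if __name__ == "__main__":
    print(pronounce("kata"))   # found in the toy lexicon
    print(pronounce("khang"))  # out of vocabulary, handled by the rules
```

A context-free, longest-match table is the simplest possible design; a real rule set would typically also condition each rule on the neighbouring letters, and a dialect-specific variant could swap in a different rule table.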
Similar resources
The Festvox Indic Frontend for Grapheme-to-Phoneme Conversion
Text-to-Speech (TTS) systems convert text into phonetic pronunciations which are then processed by Acoustic Models. TTS frontends typically include text processing, lexical lookup and Grapheme-to-Phoneme (g2p) conversion stages. This paper describes the design and implementation of the Indic frontend, which provides explicit support for many major Indian languages, along with a unified framewor...
Phonology of Exceptions For Korean Grapheme-to-Phoneme Conversion
Being an essential part of a Korean speech recognition system and a Text-To-Speech (TTS) system, a Korean Grapheme-to-Phoneme conversion system is generally composed of a set of regular rules and an exceptions dictionary [1, 2, 3]. The exceptions have been recorded in the dictionary in a simple and random manner, whereas research on the regular rules has progressed actively. This pap...
Grapheme to phoneme conversion: an Arabic dialect case
We aim to develop a Speech-to-Speech translation system between Modern Standard Arabic and the Algiers dialect. Such a system must include a Text-to-Speech module, which in turn must include a Grapheme-to-Phoneme converter. The Algiers dialect is an Arabic dialect affected by most of the problems that Modern Standard Arabic poses in the NLP area. Furthermore, it could be considered as an under-resourced language becau...
Tonal Alignment and Prosodic Word domain in Boro
This paper discusses the way morphological factors influence the distribution of lexical tones in Boro. Another objective of this paper is to unravel the underlying tonal nature of various prefixes and suffixes that participate in these processes. Boro is a tone language belonging to the Tibeto-Burman group. The language lexically distinguishes L and H tones. The TBU in Boro is the syllable and...